Abstract

A service system with multiple types of customers, arriving according to Poisson processes, is considered.
The system is heterogeneous in that the servers also can be of multiple types. Each customer has
an independent exponentially distributed service time, with the mean determined by its type. Multiple 
customers (possibly of different types) can be placed for service into one server, subject to “packing” 
constraints, which depend on the server type. Service times of different customers are independent, even 
if served simultaneously by the same server. The large-scale asymptotic regime is considered such that 
the customer arrival rates grow to infinity. 

We consider two variants of the model. For the infinite-server model, we prove asymptotic optimality 
of the Greedy Random (GRAND) algorithm in the sense of minimizing the weighted (by type) number 
of occupied servers in steady-state. (This version of GRAND generalizes that introduced in [15] for the
homogeneous systems, with all servers of the same type.) We then introduce a natural extension of the GRAND
algorithm for finite-server systems with blocking. Assuming subcritical system load, we prove existence,
uniqueness, and local stability of the large-scale system equilibrium point such that no blocking occurs. 

This result strongly suggests a conjecture that the steady-state blocking probability under the algorithm 
vanishes in the large-scale limit. 

Keywords: Queueing networks. Stochastic bin packing. Heterogeneous service systems. Packing constraints. 
1 Introduction


We consider a heterogeneous service system where servers can be of multiple types. There are also multiple 
types of customers, each arriving according to an independent Poisson process. Each customer has an independent
exponentially distributed service time, with the mean determined by its type. Multiple customers
(possibly of different types) can be placed for service into one server, subject to “packing” constraints, which
depend on the server type. Service times of different customers are independent, even if served simultaneously
by the same server. Such a system arises, for example, as a model of dynamic real-time assignment of
virtual machines (“customers”) to physical host machines (“servers”) in a network cloud [6], where typical
objectives may be to minimize the number of occupied (non-idle) hosts or to minimize blocking/waiting of
virtual machines. In this paper we consider two variants of the system, and study their properties in the
large-scale asymptotic regime, when the customer arrival rates (and then the number of occupied servers)
are large. 

The first variant of the system is such that there is an infinite “supply” of servers of each type. Each arriving
customer is assigned to a server immediately upon arrival. The asymptotic regime is considered such that
the customer arrival rates grow in proportion to a scaling parameter $r \to \infty$. Each server type $s$ is assigned
a weight (“cost”) $\gamma_s$, and the objective is to minimize the weighted number (“total cost”) of occupied servers
in steady-state. We prove that a generalized version of the Greedy Random (GRAND) algorithm, introduced
in [15] for a homogeneous system (with one server type), is asymptotically optimal, in the sense described
below in this paragraph. The basic idea of GRAND is to assign an arriving customer of a given type $i$ to a
server chosen uniformly at random among the servers available to it, i.e. those servers where a type $i$ customer
can be added without violating packing constraints. The particular GRAND algorithm that we consider for
the infinite-server system, labeled GRAND($aZ$), is as follows. There is a parameter $a_s > 0$ for
each server type $s$; $a = (a_s)$ is the vector with components $a_s$. An arriving customer picks uniformly at
random an available server among all currently occupied servers plus designated numbers $a_s Z$ of idle servers
(called “zero-servers”) of each type $s$, where $Z$ is the current total number of all customers. (The GRAND($aZ$)
algorithm of [15] is a special case of our GRAND($aZ$), with a single parameter $a > 0$, because there is only one
server type.) GRAND($aZ$) achieves optimality if we first take the limit of system stationary distributions
as $r \to \infty$, and then take the limit $a_s = a^{\gamma_s} \downarrow 0$, with common parameter $a \downarrow 0$. (We believe that a
stronger form of asymptotic optimality, when only the limit $r \to \infty$ is taken, holds for a different version of
GRAND, with the number of zero-servers of type $s$ equal to $Z^{1+(p-1)\gamma_s}$, where parameter $p < 1$ is close to
1. See Conjecture 4 at the end of Section 2.2.)

It is important to emphasize that GRAND($aZ$) achieves asymptotic optimality without utilizing any knowledge
of the system structural parameters. Namely, the algorithm need not “know” the server types or exact
states of the currently occupied servers. All it needs to know about each currently occupied server is whether
or not it can “accept” an additional customer of type $i$, for each $i$. Note that the setting of the algorithm
parameters $a_s$ that achieves asymptotic optimality depends only on the weights $\gamma_s$, which are the parameters
of the objective (as opposed to system parameters). One of the key qualitative insights of [11] was the
surprising fact that an algorithm as simple as GRAND can be asymptotically optimal. The fact that an
appropriately generalized, but still extremely simple, version of GRAND is optimal in a heterogeneous
system is still more surprising.

The second variant is a system with finite size pools of servers of each type. Each arriving customer can be 
either immediately assigned to a server or immediately blocked (in which case it leaves the system without 
receiving service). The asymptotic regime is such that both the arrival rates and the server pool sizes scale 
in proportion to parameter $r \to \infty$. We consider a different version of the GRAND algorithm, labeled
GRAND-F, which simply assigns each arriving customer uniformly at random to any server available to it in
the system, and blocks the customer if there are no such available servers. We study the dynamics of the
fluid paths (obtained by “fluid” scaling and then the $r \to \infty$ limit). Assuming the system is subcritically
loaded, we prove existence, uniqueness and local stability of a system equilibrium point, such that there is 
no blocking. These results strongly suggest a conjecture that GRAND-F is asymptotically optimal in that, 
under subcritical load, the limit of the system stationary distributions is concentrated on the equilibrium 
point described above, and therefore the steady-state blocking probability vanishes in the $r \to \infty$ limit. We
note that the equilibrium point local stability property is stronger than a typical “fixed point” argument, 
based on the assumption of asymptotic independence of server states (or, “independence ansatz,” in the 
terminology of mm)- The fixed point argument allows one to characterize (and then possibly derive) the 
limit of the stationary distributions, assuming the ansatz holds. If the ansatz is proved, this of course proves 
the limit of the stationary distributions. If the ansatz is not proved, the fixed point argument is equivalent 
to the property that the equilibrium point is an invariant point of the fluid paths. The local stability of the 
equilibrium point that we prove, is a stronger property than just its existence and invariance, and therefore 
it provides a stronger support for the asymptotic optimality conjecture. (The relation between the local 
stability and the fixed point argument is discussed in detail in Section [ATJ) 

We want to emphasize that the packing constraints that we consider are extremely general. (They are of the 
same kind as those in [IMin]; we additionally allow them to depend on the server type.) In particular, they
are far more general than vector packing constraints. Vector packing refers to the situation when a server 
has the corresponding resource-vector, giving the amounts of resources of different types that it possesses; 
for each customer type there is the requirement-vector, giving the resource requirements of one customer; 
the constraint is that the sum of the requirement-vectors of the customers placed into a server cannot exceed 
its resource-vector. Packing of virtual machines into physical machines in a network cloud [6] is an example 
of vector packing. 
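The vector packing constraint described above can be sketched as follows. This is a minimal illustration, assuming integer resource amounts; the function names `fits` and `feasible_configurations` and the numbers in the usage note are hypothetical and do not come from the paper.

```python
from itertools import product


def fits(counts, requirements, capacity):
    """Return True if counts[i] customers of each type i fit within capacity."""
    for d, cap in enumerate(capacity):
        # Total amount of resource d used by all customers placed in the server.
        used = sum(n * req[d] for n, req in zip(counts, requirements))
        if used > cap:
            return False
    return True


def feasible_configurations(requirements, capacity):
    """Enumerate all count vectors allowed by the vector packing constraint."""
    bounds = []
    for req in requirements:
        # Largest number of customers of this type that fit into an empty server.
        limits = [cap // r for r, cap in zip(req, capacity) if r > 0]
        if not limits:
            raise ValueError("each customer type must need some resource")
        bounds.append(min(limits))
    ranges = [range(b + 1) for b in bounds]
    return [k for k in product(*ranges) if fits(k, requirements, capacity)]
```

For example, with a server resource-vector `(4, 8)` and requirement-vectors `(1, 4)` and `(2, 2)`, the call `feasible_configurations([(1, 4), (2, 2)], (4, 8))` lists six count vectors, including the empty one. The general packing constraints of this paper allow any finite monotone set of configurations, of which such a list is only a special case.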

Finally, we note that GRAND-F can be very efficiently implemented via a “pull-based” mechanism (see [TB] 
and references therein), which has a very low signaling message exchange rate between the “router” and 
the servers. In fact, the GRAND-F algorithm can be viewed as an extension of the PULL algorithm [16] to service
systems with packing constraints. (This is discussed in more detail in Remark 6 in Section 2.3.)

1.1 Related previous work 

As mentioned above, the main practical motivation for our model is the problem of real-time dynamic assignment
of virtual machines (VM) to physical host machines (PM) in a network cloud. (A general discussion
of the issues that arise in this application can be found in |^.) Since multiple VMs can simultaneously 
occupy (be “packed into”) the same PM, this naturally leads to bin packing type models. There is an extensive
literature on the classical bin packing (see, e.g., [DHIH] for reviews and recent results), where each “item” 
(customer) once placed into a “bin” (server) stays in that bin forever. However, the dynamic VM-to-PM 
assignment problem is such that each VM (customer) leaves its PM (server), and the system, after its service 
is completed. This in turn naturally leads to the models that we consider, i.e. service systems with packing
constraints at the servers. 

The infinite-server variant of our model is a generalization of the homogeneous (one server type) model 
studied in [TSHlS], which focused on the problem of minimizing the number of occupied servers in steady-state.
In particular, the GRAND algorithm was proposed and shown to be asymptotically optimal in [15].
(Papers [CTIT^ have studied a different algorithm, which needs to know the structure of packing constraints 
and to use the exact current states of all servers.) Our model allows, in addition, multiple server types 
and we consider a more general problem of minimizing the weighted number of servers; the analysis of this 
variant of our model is a generalization of that in [15]. A homogeneous infinite-server model, specialized to
vector packing constraints, was also considered in [5], where a randomized version of the Best Fit algorithm was
proved asymptotically optimal. 

The finite-server variant of our model is related to the model in the recent paper [19], which considers blocking in
a homogeneous system, specialized to one-dimensional (single resource) vector packing constraints. (In [19] 
all servers are of the same type, and the term heterogeneous refers to multiple customer types, which our 
model also allows. So, in our terminology, the system in [19] is homogeneous.) The algorithm in [19] is of the 
power-of-d-choices type pilSl I 1211TB] . namely each arriving customer goes to the server which has the largest 
amount of unused resource, out of the d servers chosen uniformly at random. The paper uses a fixed point 
argument (independence ansatz) to derive the form of the equilibrium point, which is conjectured to be the 
asymptotic limit of the system steady-state. (In addition, the paper derives some performance bounds.) 
Of course, the equilibrium point under the power-of-d-choices algorithm is different from that under our 
GRAND-F algorithm. It is such that the blocking probability does not (and cannot be expected to) vanish 
in the limit. Therefore, the relation between the power-of-d-choices algorithm and GRAND-F for the systems 
with packing constraints, is analogous to the relation between power-of-d-choices and PULL algorithm [16]
for service systems without packing, where the blocking (or waiting) probability vanishes under PULL, but 
not under the power-of-d-choices. (GRAND-F can be viewed as an extension of the PULL algorithm to systems
with packing constraints. See Remark 6 in Section 2.3.)

Papers mm consider a homogeneous finite-server system with queues (and no blocking), and focus on the 
system stability (or, throughput maximization). In [7] a heterogeneous finite-server system is considered, with 
the objective of minimizing maximum load across server pools; the algorithms proposed in [7] essentially 
treat the system as an infinite-server one. The algorithms in [ziiiniiii] are completely different from the
variants of GRAND algorithm studied in this paper. 


1.2 Layout of the rest of the paper 

Basic notation used throughout the paper is given in Section 1.3. The model and the main results are
stated in Section 2. The basic structure of the system, common to both variants, is given in Section 2.1.
The infinite-server system, the GRAND($aZ$) algorithm and the main results for it (Theorems 2 and 3) are
presented in Section 2.2. Section 2.3 defines the finite-server system, the GRAND-F algorithm, and states the
main result for it informally in Proposition [8] (with formal statements given later in Lemmas and mi). 
Sections 3 and 4 contain proofs of the infinite-server/GRAND($aZ$) results, while Section 5 contains those for
finite-server/GRAND-F. Concluding remarks are given in Section 6.


1.3 Basic notation 

Sets of real and real non-negative numbers are denoted by $\mathbb{R}$ and $\mathbb{R}_+$, respectively. We use bold and plain
letters for vectors and scalars, respectively. The standard Euclidean norm of a vector $x \in \mathbb{R}^n$ is denoted by
$\|x\|$. Convergence $x \to u \in \mathbb{R}^n$ means ordinary convergence in $\mathbb{R}^n$, while $x \to U \subseteq \mathbb{R}^n$ means convergence
to a set, namely, $\inf_{u \in U} \|x - u\| \to 0$. The $i$-th coordinate unit vector in $\mathbb{R}^n$ is denoted by $e_i$. Symbol $\Rightarrow$
denotes convergence in distribution of random variables taking values in space $\mathbb{R}^n$ equipped with the Borel
$\sigma$-algebra. The abbreviation w.p.1 means convergence with probability 1. We often write $x(\cdot)$ to mean the
function (or random process) $\{x(t), t \ge 0\}$. Abbreviation u.o.c. means uniform on compact sets convergence
of functions. The cardinality of a finite set $\mathcal{N}$ is $|\mathcal{N}|$. Indicator function $I\{A\}$ for a condition $A$ is equal to 1
if $A$ holds and 0 otherwise. $\lceil \xi \rceil$ denotes the smallest integer greater than or equal to $\xi$, and $\lfloor \xi \rfloor$ denotes the
largest integer smaller than or equal to $\xi$. For a finite set of scalar functions $f_n(t)$, $t \ge 0$, $n \in \mathcal{N}$, a point $t$
is called regular if for any subset $\mathcal{N}' \subseteq \mathcal{N}$ the derivatives $\frac{d}{dt} \max_{n \in \mathcal{N}'} f_n(t)$ and $\frac{d}{dt} \min_{n \in \mathcal{N}'} f_n(t)$ exist.


2 Model and main results 


In this section we formally define the two variants of the model with heterogeneous servers, and state our 
main results for them. The first variant is a generalization of the infinite-server model in [IMS] in that we 
allow different types of servers, as opposed to just one type. The number of servers of each type is infinite 
and there is no blocking of arriving customers. For this version of the model the underlying objective is to 
minimize the weighted number of occupied servers in steady-state. The second variant is the model with 
different server types, but with a finite number of servers of each type. If an arriving customer cannot be
immediately assigned to some server in the system, it is blocked. In such a system, the underlying objective 
is to minimize blocking. Before defining these two variants of the model, in the next subsection we define 
the basic structure of the system (most importantly the server packing constraints), which is common for 
both model variants. 


2.1 Heterogeneous servers. Packing constraints 

We consider a service system with $I$ types of customers, indexed by $i \in \{1, 2, \ldots, I\} \equiv \mathcal{I}$. The service
time of a type-$i$ customer is an exponentially distributed random variable with mean $1/\mu_i$. All customers’
service times are mutually independent. There are $S$ types of servers, indexed by $s \in \{1, 2, \ldots, S\} \equiv \mathcal{S}$,
and an infinite “supply” of servers of each type. A server of each type can potentially serve more than one
customer simultaneously, subject to the following very general packing constraints. We say that a vector
$k = (k_1, \ldots, k_I; s)$ with non-negative integer $k_i$, $i \in \mathcal{I}$, and $s \in \mathcal{S}$ is a server configuration, if a type $s$ server
can simultaneously serve a combination of customers of different types given by the values $k_i$. A configuration
$k$ with specific value of $s$ is a type $s$ server configuration. For any $s$, there is a finite set of all allowed type
$s$ server configurations, denoted by $\bar{\mathcal{K}}^s$. We assume that $\bar{\mathcal{K}}^s$ satisfies a natural monotonicity condition: if
$k \in \bar{\mathcal{K}}^s$, then all “smaller” configurations $k' = (k'_1, \ldots, k'_I; s)$, i.e. such that $k'_i \le k_i$ for all $i$, belong to $\bar{\mathcal{K}}^s$ as
well. Without loss of generality, assume that for each $i$, $(e_i; s) \in \bar{\mathcal{K}}^s$ for at least one $s$, where $e_i$ is the $i$-th
coordinate unit vector (otherwise, type-$i$ customers cannot be served at all). By convention, for any $s$, vector
$0^s = (0; s) \in \bar{\mathcal{K}}^s$, where $0$ is the $I$-dimensional component-wise zero vector; this is the configuration of
an empty type $s$ server. We denote by $\mathcal{K}^s = \bar{\mathcal{K}}^s \setminus \{0^s\}$ the set of type $s$ server configurations not including
the empty (or, zero) configuration. Denote by $\bar{\mathcal{K}} = \cup_s \bar{\mathcal{K}}^s$ and $\mathcal{K} = \cup_s \mathcal{K}^s$ the sets of all configurations and
all non-zero configurations, respectively. In what follows, we use the following slightly abusive notation: for
$k \in \bar{\mathcal{K}}$, $k + e_i$ means vector $k$ with $k_i$ replaced by $k_i + 1$, and similarly for $k - e_i$.
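The configuration sets and the monotonicity condition above can be illustrated with a short sketch. It assumes that the configurations of one server type are stored as a set of tuples of per-type customer counts; the helper names (`add_customer`, `remove_customer`, `is_monotone`, `can_accept`) are hypothetical.

```python
def add_customer(k, i):
    """Configuration k with the count of type i increased by one."""
    return k[:i] + (k[i] + 1,) + k[i + 1:]


def remove_customer(k, i):
    """Configuration k with the count of type i decreased by one."""
    if k[i] == 0:
        raise ValueError("no type-i customer to remove")
    return k[:i] + (k[i] - 1,) + k[i + 1:]


def is_monotone(allowed):
    """Check that removing any one customer from an allowed configuration
    gives an allowed configuration; by induction this covers every
    componentwise-smaller configuration."""
    allowed = set(allowed)
    return all(
        remove_customer(k, i) in allowed
        for k in allowed
        for i in range(len(k))
        if k[i] > 0
    )


def can_accept(k, i, allowed):
    """True if a type-i customer can be added to a server in configuration k."""
    return add_customer(k, i) in allowed
```

Only the one-step check is needed in `is_monotone`, because any smaller configuration is reached from a given one by removing customers one at a time.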

An important feature of the model is that simultaneous service does not affect the service time distributions 
of individual customers. In other words, the service time of a customer is unaffected by whether or not there 
are other customers served simultaneously by the same server. A customer can be “added” to an empty or 
occupied server, as long as the packing constraints are not violated. Namely, a type $i$ customer can be added
to a server of type $s$ whose current configuration $k \in \bar{\mathcal{K}}^s$ is such that $k + e_i \in \mathcal{K}^s$. When the service of a
type-$i$ customer by a server in configuration $k$ is completed, the customer leaves the system and the server’s
configuration changes to $k - e_i$.


2.2 Infinite-server system 

In this section we define the infinite-server system, the proposed generalized GRAND($aZ$) assignment (or
packing) algorithm, and state the asymptotic optimality results for this algorithm.

We consider a system, as described in Section 2.1, in which there is an infinite “supply” of servers of each type
$s \in \mathcal{S}$. Customers of type $i$ arrive as an independent Poisson process of rate $\Lambda_i > 0$; these arrival processes
are independent of each other and of the customer service times. Each arriving customer is immediately 
placed for service in one of the servers, as long as packing constraints are not violated. 

Denote by $X_k$ the number of servers in configuration $k \in \mathcal{K}$. The system state is then the vector $X =
\{X_k, k \in \mathcal{K}\}$.

A placement algorithm (or packing rule) determines where an arriving customer is placed, as a function of 
the current system state $X$. Under any well-defined placement algorithm, the process $\{X(t), t \ge 0\}$ is a
continuous-time Markov chain with a countable state space. It is easily seen to be irreducible and positive
recurrent: the positive recurrence follows from the fact that the total number $Y_i(t)$ of type-$i$ customers in the
system is independent of the placement algorithm, and its stationary distribution is Poisson with mean
$\Lambda_i/\mu_i$; we denote by $Y_i(\infty)$ the random value of $Y_i(t)$ in steady-state; it is, therefore, a Poisson random
variable with mean $\Lambda_i/\mu_i$. Consequently, the process $\{X(t), t \ge 0\}$ has a unique stationary distribution; let
$X(\infty) = \{X_k(\infty), k \in \mathcal{K}\}$ be the random system state $X(t)$ in stationary regime.

We are interested in finding a placement algorithm that minimizes the total weighted number of occupied 
servers in the stationary regime. 

Consider the following generalization of the Greedy-Random (GRAND) algorithm, introduced in [15]. More 
specifically, it is a generalization of the special form of the algorithm, called in [15] GRAND($aZ$).

Definition 1 (Greedy-Random (GRAND($aZ$)) algorithm for heterogeneous infinite-server systems). The
algorithm is parameterized by a vector $a = (a_s, s \in \mathcal{S})$ of real numbers $a_s > 0$. Let $Z(t) = \sum_{k \in \mathcal{K}} \sum_i k_i X_k(t)$
denote the total number of customers in the system at time $t$. At any given time $t$, there is a designated
finite set of $X_{0^s}(t) = \lceil a_s Z(t) \rceil \ge 0$ empty type $s$ servers, called $s$-zero-servers.

A new customer, say of type $i$, arriving at time $t$ is placed into a server chosen uniformly at random among
those zero-servers (of any type $s$) and occupied servers, where it can still fit. In other words, the total number
of servers available to a type-$i$ arrival at time $t$ is

$$X_{(i)}(t) \doteq \sum_{s:\, (e_i; s) \in \mathcal{K}^s} X_{0^s}(t) + \sum_{k \in \mathcal{K}:\, k + e_i \in \mathcal{K}} X_k(t).$$

If $X_{(i)}(t) = 0$, the customer is placed into an empty server of any type $s$ such that $(e_i; s) \in \mathcal{K}^s$.
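One placement step of the rule in Definition 1 might be sketched as follows. This is an illustrative sketch only, under the following assumptions: the state is a dictionary mapping `(s, k)` to the number of occupied type-`s` servers in configuration `k`, `configs[s]` is the set of all allowed type-`s` configurations (including the empty one), and the names `grand_az_place` and `add_customer` are hypothetical.

```python
import math
import random


def add_customer(k, i):
    """Configuration k with the count of type i increased by one."""
    return k[:i] + (k[i] + 1,) + k[i + 1:]


def grand_az_place(state, i, num_types, configs, a, rng):
    """Place one type-i arrival and return the (s, k) it ends up in."""
    zero = (0,) * num_types
    # Z(t): total number of customers currently in the system.
    z = sum(sum(k) * n for (_, k), n in state.items())
    candidates, weights = [], []
    # Zero-servers: ceil(a_s * Z) empty servers of each type that can hold type i.
    for s, allowed in configs.items():
        if add_customer(zero, i) in allowed:
            candidates.append((s, zero))
            weights.append(math.ceil(a[s] * z))
    # Occupied servers where a type-i customer still fits.
    for (s, k), n in state.items():
        if n > 0 and add_customer(k, i) in configs[s]:
            candidates.append((s, k))
            weights.append(n)
    if sum(weights) == 0:
        # No available server at all: use an empty server of a suitable type.
        s, k = candidates[0]
    else:
        # Uniform choice among servers = choice among classes weighted by size.
        s, k = rng.choices(candidates, weights=weights, k=1)[0]
    if k != zero:
        state[(s, k)] -= 1
        if state[(s, k)] == 0:
            del state[(s, k)]
    new_k = add_customer(k, i)
    state[(s, new_k)] = state.get((s, new_k), 0) + 1
    return s, new_k
```

Choosing a class `(s, k)` with probability proportional to the number of servers in it is equivalent to choosing an individual server uniformly at random, which is why the sketch never needs to track individual servers.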

The GRAND(aZ) algorithm is easily implementable. (A detailed discussion of the implementation issues of 
the GRAND algorithm is given below in Remark 6 in the context of finite-server systems.)

We now define the asymptotic regime. Let $r \to \infty$ be a positive scaling parameter. More specifically, assume
that $r \ge 1$, and $r$ increases to infinity along a discrete sequence. Customer arrival rates scale linearly with
$r$; namely, for each $r$, $\Lambda_i^r = \lambda_i r$, where $\lambda_i$ are fixed positive parameters. Let $\{X^r(t), t \ge 0\}$ be the process
associated with a system with parameter $r$, and let $X^r(\infty)$ be the (random) system state in the stationary
regime. (Note that we do not include the zero-server numbers $X^r_{0^s}(t)$ into $X^r(t) = \{X^r_k(t), k \in \mathcal{K}\}$.) For
each $i$, denote by $Y^r_i(t) = \sum_{k \in \mathcal{K}} k_i X^r_k(t)$ the total number of customers of type $i$. Since arriving customers
are placed for service immediately and their service times are independent of each other and of the rest
of the system, $Y^r_i(\infty)$ is a Poisson random variable with mean $r\rho_i$, where $\rho_i = \lambda_i/\mu_i$. Moreover, $Y^r_i(\infty)$
are independent across $i$. Since the total number of occupied servers is no greater than the total number of
customers, $\sum_{k \in \mathcal{K}} X^r_k(t) \le Z^r(t) = \sum_i Y^r_i(t)$, we have a simple upper bound on the total number of occupied
servers in steady state, $\sum_{k \in \mathcal{K}} X^r_k(\infty) \le Z^r(\infty) = \sum_i Y^r_i(\infty)$, where $Z^r(\infty)$ is a Poisson random variable
with mean $r \sum_i \rho_i$. Without loss of generality, from now on we assume $\sum_i \rho_i = 1$. This is equivalent to
rechoosing the parameter $r$ to be $r \sum_i \rho_i$.

The fluid-scaled process is $x^r(t) = X^r(t)/r$, $t \in [0, \infty)$. We also define $x^r(\infty) = X^r(\infty)/r$. For any $r$, $x^r(t)$
takes values in the non-negative orthant $\mathbb{R}_+^{|\mathcal{K}|}$. Similarly, $y^r_i(t) = Y^r_i(t)/r$, $z^r(t) = Z^r(t)/r$, $x^r_{0^s}(t) = X^r_{0^s}(t)/r$
and $x^r_{(i)}(t) = X^r_{(i)}(t)/r$, for $t \ge 0$ and $t = \infty$. Since $\sum_{k \in \mathcal{K}} x^r_k(\infty) \le z^r(\infty) = Z^r(\infty)/r$, we see that the
random variables $\sum_{k \in \mathcal{K}} x^r_k(\infty)$ are uniformly integrable in $r$. This in particular implies that the sequence
of distributions of $x^r(\infty)$ is tight, and therefore there always exists a limit $x(\infty)$ in distribution, so that
$x^r(\infty) \Rightarrow x(\infty)$, along a subsequence of $r$.

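The limit (random) vector $x(\infty)$ satisfies the following conservation laws:

$$\sum_{k \in \mathcal{K}} k_i x_k(\infty) \equiv y_i(\infty) = \rho_i, \quad \forall i, \qquad (1)$$

and, in particular,

$$z(\infty) \equiv \sum_i y_i(\infty) = \sum_i \rho_i = 1. \qquad (2)$$

Therefore, the values of $x(\infty)$ are confined to the convex compact $(|\mathcal{K}| - I)$-dimensional polyhedron

$$\mathcal{X} \equiv \Big\{ x \in \mathbb{R}_+^{|\mathcal{K}|} \;:\; \sum_s \sum_{k \in \mathcal{K}^s} k_i x_k = \rho_i, \ \forall i \in \mathcal{I} \Big\}.$$

We will slightly abuse notation by using symbol $x$ for a generic element of $\mathcal{X}$; while $x(\infty)$, and later $x(t)$,
refer to random elements taking values in $\mathcal{X}$.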

Also note that under GRAND($aZ$), for any server type $s$, $x^r_{0^s}(\infty) \Rightarrow x_{0^s}(\infty) = a_s z(\infty) = a_s$, as $r \to \infty$.

The asymptotic regime and the associated basic properties (1) and (2) hold for any placement algorithm.
Indeed, (1) and (2) only depend on the already mentioned fact that all $Y^r_i(\infty)$ are mutually independent
Poisson random variables with means $\rho_i r$.

Let the server weights $\gamma_s > 0$, $s \in \mathcal{S}$, be fixed. (One can think of $\gamma_s$ as the “cost” rate of using one type $s$
server.) Consider the following problem of minimizing the weighted number of occupied servers, on the fluid
scale: $\min_{x \in \mathcal{X}} \sum_{s \in \mathcal{S}} \sum_{k \in \mathcal{K}^s} \gamma_s x_k$. It is a linear program:

$$\min_{x \in \mathbb{R}_+^{|\mathcal{K}|}} \sum_{s \in \mathcal{S}} \sum_{k \in \mathcal{K}^s} \gamma_s x_k, \qquad (3)$$

subject to

$$\sum_{k \in \mathcal{K}} k_i x_k = \rho_i, \quad \forall i. \qquad (4)$$

Without loss of generality, assume that the weights are scaled so that $\gamma_1 = 1$. Denote by $\mathcal{X}^* \subseteq \mathcal{X}$ the set of
optimal solutions of (3)-(4).
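As a small worked illustration of this linear program (a hypothetical instance chosen for simplicity, not one taken from the paper), consider one customer type with $\rho_1 = 1$ and one server type with weight $\gamma_1 = 1$ that can hold at most two customers, so that the non-zero configurations are $k = 1$ and $k = 2$.

```latex
% Hypothetical instance: I = 1, S = 1, non-zero configurations k = 1 and k = 2,
% rho_1 = 1, gamma_1 = 1.
\begin{align*}
  &\min_{x_1, x_2 \ge 0} \; x_1 + x_2
  \quad \text{subject to} \quad x_1 + 2 x_2 = 1 . \\
  &\text{Substituting } x_1 = 1 - 2 x_2 \text{ with } 0 \le x_2 \le \tfrac12
   \text{ gives the objective } 1 - x_2 , \\
  &\text{so the unique optimal solution is } (x_1, x_2) = (0, \tfrac12)
   \text{ with optimal value } \tfrac12 .
\end{align*}
```

In this instance the optimum packs every server with two customers, which uses half as many servers as customers.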

For future reference, we record the following observations and notation. Using the monotonicity of $\bar{\mathcal{K}}$, it is
easy to check that if in the LP (3)-(4) we replace equality constraints (4) with the inequality constraints

$$\sum_{k \in \mathcal{K}} k_i x_k \ge \rho_i, \quad \forall i, \qquad (5)$$

the new LP (3),(5) has the same optimal value, and its set of optimal solutions $\mathcal{X}^{**}$ contains $\mathcal{X}^*$, or more
precisely, $\mathcal{X}^* = \mathcal{X}^{**} \cap \mathcal{X}$. From here, using the Kuhn-Tucker theorem, $x \in \mathcal{X}^*$ if and only if there exists a
vector $\eta = \{\eta_i, i \in \mathcal{I}\}$ of Lagrange multipliers, corresponding to the inequality constraints (5), such that
the following conditions hold:

$$x \in \mathcal{X}, \qquad (6)$$

$$\eta_i \ge 0, \quad \forall i \in \mathcal{I}, \qquad (7)$$

$$\sum_i k_i \eta_i \le \gamma_s, \quad k \in \mathcal{K}^s, \qquad (8)$$

$$\text{for } k \in \mathcal{K}^s, \text{ condition } \sum_i k_i \eta_i < \gamma_s \text{ implies } x_k = 0. \qquad (9)$$

Vectors $\eta$ satisfying (7)-(9) for some $x \in \mathcal{X}$ are optimal solutions to the problem dual to LP (3),(5). They
form a convex set, which we denote by $\mathcal{H}^*$; it is easy to check that $\mathcal{H}^*$ is compact.

For each parameter-vector $a$ (as in the definition of the GRAND($aZ$) algorithm), denote

$$L^{(a)}(x) = \sum_s \sum_{k \in \mathcal{K}^s} x_k \log[x_k c_k/(e a_s)], \qquad (10)$$

where $c_k \doteq \prod_i k_i!$, $0! = 1$. Then for $k \in \mathcal{K}^s$ we have

$$(\partial/\partial x_k) L^{(a)}(x) = \log[x_k c_k/a_s]. \qquad (11)$$

Note that if we adopt a convention that

$$(\partial/\partial x_{0^s}) L^{(a)}(x) \big|_{x_{0^s} = a_s} = 0, \qquad (12)$$

then (11) is valid for $k = 0^s$ and $x_{0^s} = a_s$, which will be useful later.
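The partial derivative formula above can be sanity-checked numerically. The sketch below is only an illustration, restricted to a single server type with a scalar parameter `a`; the function names and the sample values are hypothetical.

```python
import math


def objective(x, c, a):
    """sum_k x_k * log[x_k * c_k / (e * a)] for a single server type."""
    return sum(xk * math.log(xk * ck / (math.e * a)) for xk, ck in zip(x, c))


def partial_derivative(x, c, a, j):
    """Closed-form partial derivative with respect to x_j: log[x_j * c_j / a]."""
    return math.log(x[j] * c[j] / a)


def finite_difference(x, c, a, j, h=1e-6):
    """Central finite-difference approximation of the same derivative."""
    up = list(x)
    down = list(x)
    up[j] += h
    down[j] -= h
    return (objective(up, c, a) - objective(down, c, a)) / (2 * h)
```

The extra factor $e$ inside the logarithm of the objective is what cancels the constant produced by differentiating $x_k \log x_k$, so the derivative is a pure logarithm. In particular the closed form vanishes at $x_j = a$ when $c_j = 1$, in line with the convention for zero-servers.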


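The function $L^{(a)}(x)$ is strictly convex in $x \in \mathbb{R}_+^{|\mathcal{K}|}$. Consider the problem $\min_{x \in \mathcal{X}} L^{(a)}(x)$. It is the following
convex optimization problem:

$$\min_{x \in \mathbb{R}_+^{|\mathcal{K}|}} L^{(a)}(x), \qquad (13)$$

subject to

$$\sum_{k \in \mathcal{K}} k_i x_k = \rho_i, \quad \forall i. \qquad (14)$$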
Denote by $x^{*,a} \in \mathcal{X}$ its unique optimal solution. Using (11) it is easy to check that $x^{*,a}_k > 0$ for all $k \in \mathcal{K}$.
There exists a vector $\nu^{*,a} = \{\nu^{*,a}_i, i \in \mathcal{I}\}$ of Lagrange multipliers for the constraints (14), such that $x^{*,a}$
solves problem

$$\min_{x \in \mathbb{R}_+^{|\mathcal{K}|}} L^{(a)}(x) + \sum_i \nu^{*,a}_i \Big( \rho_i - \sum_{k \in \mathcal{K}} k_i x_k \Big).$$

We see that $\log[x^{*,a}_k c_k/a_s] - \sum_i k_i \nu^{*,a}_i = 0$, $k \in \mathcal{K}^s$. Therefore, $x^{*,a}$ has the product form

$$x^{*,a}_k = \frac{a_s}{c_k} \exp\Big[ \sum_i k_i \nu^{*,a}_i \Big], \quad k \in \mathcal{K}^s. \qquad (15)$$

This in particular implies that the Lagrange multipliers $\nu^{*,a}_i$ are unique and are equal to
$\nu^{*,a}_i = \log(x^{*,a}_{(e_i; s)}/a_s)$, by considering (15) for the configurations $(e_i; s)$, $i \in \mathcal{I}$; note also that they can have any sign (not necessarily
non-negative). Therefore, we obtain the following fact. A point $x \in \mathcal{X}$ is the optimal solution to (13)-(14)
(that is $x = x^{*,a}$) if and only if it has a product form representation (15) for some vector $\nu^{*,a}$. (The ’only
if’ part we just proved, and the ’if’ follows from the Kuhn-Tucker theorem.)
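The product form characterization suggests a simple way to compute the optimal point numerically in small cases. The sketch below is an illustration only, restricted to a single customer type so that there is one scalar multiplier `nu`, which is found by bisection; the function names and the sample instance are hypothetical.

```python
import math


def product_form(nu, configs, a):
    """x_k = (a_s / k!) * exp(k * nu) for each (s, k), with one customer type."""
    return {
        (s, k): a[s] / math.factorial(k) * math.exp(k * nu)
        for (s, k) in configs
    }


def load(x):
    """Left-hand side of the conservation constraint: sum over k of k * x_k."""
    return sum(k * xk for (_, k), xk in x.items())


def solve_multiplier(configs, a, rho=1.0, lo=-50.0, hi=50.0, iterations=200):
    """Bisection on nu; load(product_form(nu)) is increasing in nu."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if load(product_form(mid, configs, a)) < rho:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For instance, with `configs = [(0, 1), (0, 2)]` and `a = {0: 0.1}`, calling `solve_multiplier(configs, a)` and then `product_form` gives a point that satisfies the conservation constraint with $\rho = 1$. With several customer types the same idea requires a multidimensional root search, which is not shown here.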

Our main results on the asymptotic optimality of the GRAND($aZ$) algorithm for the system with infinite number
of servers are the following Theorems 2 and 3.

Theorem 2. Let the parameter vector $a$ be fixed. Consider a sequence of systems under the GRAND($aZ$)
algorithm, indexed by $r$, and let $x^r(\infty)$ denote the random state of the fluid-scaled process in the stationary
regime. Then, as $r \to \infty$,

$$x^r(\infty) \Rightarrow x^{*,a}.$$

Theorem 3. Suppose the parameter vector $a$ itself depends on a single parameter $a > 0$ as follows: $a_s =
a^{\gamma_s}$, $s \in \mathcal{S}$. Then, as $a \downarrow 0$, $x^{*,a} \to \mathcal{X}^*$ and $(-\log a)^{-1} \nu^{*,a} \to \mathcal{H}^*$.

Theorems 2 and 3 show that GRAND($aZ$) is asymptotically optimal in the sense that $x^r(\infty)$ converges to
the optimal set $\mathcal{X}^*$, if we first take the limit $r \to \infty$, and then take the limit $a \downarrow 0$ with $a_s = a^{\gamma_s}$.
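The second limit, in the parameter $a$, can be illustrated on a toy instance in which the product-form point is available in closed form. The sketch assumes one customer type with load 1 and one server type of weight 1 holding at most two customers; the function name `grand_limit_point` is hypothetical.

```python
import math


def grand_limit_point(a):
    """Product-form point for a server type that holds up to two customers.

    With u = exp(nu), the point is x1 = a*u and x2 = a*u**2/2, and the
    conservation constraint x1 + 2*x2 = 1 becomes a*u + a*u**2 = 1.
    """
    u = (-1.0 + math.sqrt(1.0 + 4.0 / a)) / 2.0  # positive root of the quadratic
    x1 = a * u
    x2 = a * u * u / 2.0
    return x1, x2, math.log(u)
```

In this instance packing two customers per server is optimal, so the optimal fluid state has no singly occupied servers and half a server per unit of load. As `a` decreases, `x1` shrinks roughly like the square root of `a`, `x2` approaches one half, and the multiplier divided by `-log(a)` approaches one half as well, consistent with the two convergence statements of Theorem 3.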

It was proved in a recent paper [17] (which is posterior to this paper) that a stronger form of asymptotic
optimality, when only the limit $r \to \infty$ is taken, is achieved by the following version of GRAND, called
GRAND($Z^p$). This is a GRAND algorithm with the number of zero-servers depending on $Z$ as $Z^p$, where $p < 1$
is a parameter, which is sufficiently close to 1, but depends only on the packing constraints. GRAND($Z^p$)
can be informally interpreted as GRAND($aZ$), with $a$ being variable, $a = Z^{p-1}$, rather than constant. This
suggests that for the heterogeneous infinite-server system that we consider, the stronger form of asymptotic
optimality should hold, if we make $a_s$ variable, equal to $a^{\gamma_s} = Z^{(p-1)\gamma_s}$. Specifically, we believe that the methods
of [17] can be extended to prove the following fact.

Conjecture 4. Consider the GRAND algorithm with the number of zero-servers of type $s$ equal to $Z^{1+(p-1)\gamma_s}$,
where parameter $p < 1$ is sufficiently close to 1, but depends only on the packing constraints (i.e., sets $\mathcal{K}^s$).
Then, as $r \to \infty$, $d(x^r(\infty), \mathcal{X}^*) \Rightarrow 0$, where $d(x, U)$ is the distance from point $x$ to set $U$.


2.3 Finite-server system 

We now consider a version of the system, where the number of servers of each type is finite. Namely, there
is a finite number $H_s > 0$ of servers of type $s$. Customers of type $i$ arrive as an independent Poisson process
of rate $\Lambda_i > 0$ (and these processes are independent of the customer service times). Each arriving type $i$
customer can be either immediately placed for service into one of the servers (subject to packing constraints) 
or immediately blocked, in which case it immediately leaves the system. If there is no server where an 
arriving customer can be placed, the customer is necessarily blocked. 

Let $X_k$ denote the number of servers in configuration $k \in \mathcal{K}$, and the system state is the vector $X =
\{X_k, k \in \mathcal{K}\}$. (Same notation as for the infinite-server system.) Note that we do not include the numbers
$X_{0^s}$ of empty servers of each type (i.e., $s$-zero-servers) into the state $X$. However, those numbers are, of
course, uniquely determined by $X$, because at all times we have the conservation law

$$X_{0^s} + \sum_{k \in \mathcal{K}^s} X_k \equiv \sum_{k \in \bar{\mathcal{K}}^s} X_k = H_s, \quad s \in \mathcal{S}.$$

In such a system, a placement algorithm (or packing rule) determines, depending on the current system 
state X, whether or not an arriving customer is accepted (i.e., not blocked), and if so, into which server it 
is placed. (If there are no servers, where a customer can be placed, it is necessarily blocked.) Under any 
well-defined placement algorithm, the process $\{X(t), t \ge 0\}$ is a continuous-time Markov chain with finite
state space; it is easily seen to be irreducible and, therefore, ergodic, with unique stationary distribution.
Let $X(\infty) = \{X_k(\infty), k \in \mathcal{K}\}$ be the random system state $X(t)$ in stationary regime. It is also easy to
see that $Y_i(\infty)$, the steady-state random number of all type $i$ customers in the system, is stochastically
dominated by that in the infinite-server system, i.e. by a Poisson random variable with mean $\Lambda_i/\mu_i$.

For this system, the underlying objective is to minimize blocking in steady-state. We consider the following
version of the Greedy-Random (GRAND) algorithm, for the finite-server systems. It will be labeled GRAND-F.

Definition 5 (GRAND-F). A new customer, say of type $i$, arriving at time $t$ is placed into a server chosen
uniformly at random among all servers in the system where it can still fit. (The total number of servers
available for a type $i$ customer addition at time $t$ is

$$X_{(i)}(t) = \sum_{k \in \bar{\mathcal{K}}:\, k + e_i \in \mathcal{K}} X_k(t).$$

)

If there are no such available servers (i.e., $X_{(i)}(t) = 0$), the customer is blocked.
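The GRAND-F rule of Definition 5 might be sketched as follows. This is an illustration only, assuming the state is a dictionary mapping `(s, k)` to the number of type-`s` servers currently in configuration `k` (empty servers included), and `configs[s]` is the set of all allowed type-`s` configurations; the names `grand_f_place` and `add_customer` are hypothetical.

```python
import random


def add_customer(k, i):
    """Configuration k with the count of type i increased by one."""
    return k[:i] + (k[i] + 1,) + k[i + 1:]


def grand_f_place(state, i, configs, rng):
    """Place one type-i arrival; return its new (s, k), or None if blocked."""
    # Classes of servers (empty or occupied) where a type-i customer still fits.
    candidates = [
        (s, k)
        for (s, k), n in state.items()
        if n > 0 and add_customer(k, i) in configs[s]
    ]
    if not candidates:
        return None  # no available server: the customer is blocked
    weights = [state[c] for c in candidates]
    # Weighting classes by their size gives a uniform choice among servers.
    s, k = rng.choices(candidates, weights=weights, k=1)[0]
    state[(s, k)] -= 1
    new = (s, add_customer(k, i))
    state[new] = state.get(new, 0) + 1
    return new
```

Unlike the infinite-server rule, empty servers here are ordinary members of the finite pool, so they enter the choice with their actual counts rather than through designated zero-servers.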

Remark 6. An implementation of the GRAND-F algorithm only requires that the “router” (an entity making an
assignment decision for each arriving customer) knows which servers are currently available for an addition
of a type $i$ customer, for each $i \in \mathcal{I}$. The router does not need to know the exact configurations of the
servers. Moreover, it does not even need to know the server types! Therefore, the router needs to maintain
only $I$ bits of information for each server. This in turn is easily achievable, for example, by using a pull-based
mechanism, analogous to that used by the PULL algorithm proposed in [16] (in a different context, for
systems without non-trivial packing constraints). A specific pull-based mechanism to work in conjunction
with GRAND-F can be as follows. 

(a) Upon the arrival of a customer, say of type $i$, the router follows the GRAND-F rule for choosing a server. If there
are no available servers for type $i$, the customer is blocked and no further action is taken. If the customer
is assigned to a server, the server availability state ($I$ bits) is changed to indicate the unavailability to any
customer type $i$.

(b) Each server, when its configuration changes, i.e. upon any customer arrival (assignment) or departure
(service completion), sends a “pull-message” ($I$ bits), containing its new availability state, to the router.

(c) When the router receives a pull-message from a server, it updates the availability status of that server
accordingly. (In reality, to prevent the router from using “obsolete” pull-messages, after assigning a customer
to a server, the router can use some short time-out for the server, during which the server is considered
unavailable regardless of its availability state. Thus, when the time-out expires, the availability state of the
server is that from the latest pull-message received from it. If the time-out is longer than the “round-trip”
router-server-router message delay, then the latest pull-message from the server is generated upon the last
customer assignment to it, or maybe later, upon departures that occurred after that.)

This mechanism is such that the rate of pull-messages in the system is very small, namely two pull-messages
per arriving customer (one upon its assignment and one upon its departure). The low rate of communication
between the router and the servers is a very important feature of pull-based algorithms, because in modern
cloud-based systems, the number of servers can be very large.
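A minimal sketch of the router side of steps (a)-(c) is given below. It is an illustration only: the class name `PullRouter`, its methods, and the reading of step (a) as clearing all $I$ bits of the chosen server until its next pull-message arrives are assumptions of this sketch, and the time-out refinement of step (c) is omitted.

```python
import random


class PullRouter:
    """Router keeping only I availability bits per server (hypothetical sketch)."""

    def __init__(self, num_servers, num_types, rng=None):
        self.rng = rng or random.Random()
        # avail[j][i] is True if server j last reported it can take a type-i customer.
        self.avail = [[True] * num_types for _ in range(num_servers)]
        self.pull_messages = 0  # counts pull-messages received so far

    def route(self, i):
        """Step (a): pick uniformly among servers available for type i, or block."""
        candidates = [j for j, bits in enumerate(self.avail) if bits[i]]
        if not candidates:
            return None  # customer is blocked, no further action
        j = self.rng.choice(candidates)
        # Assumption of this sketch: mark the server unavailable for every type
        # until its next pull-message reports the new availability state.
        self.avail[j] = [False] * len(self.avail[j])
        return j

    def on_pull_message(self, j, bits):
        """Step (c): overwrite the stored availability state of server j."""
        self.avail[j] = list(bits)
        self.pull_messages += 1
```

The servers themselves (step (b)) would call `on_pull_message` after every assignment or departure; the router never sees server configurations or server types.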

We also note that a key part of the PULL algorithm is the random uniform assignment of customers to 
available servers. Therefore, GRAND-F algorithm can be viewed as an extension of PULL algorithm to 
service systems with packing constraints.

We consider the asymptotic regime, where the arrival rates are increased linearly with a scaling parameter
$r \to \infty$: $\Lambda_i = \lambda_i r$, where $\lambda_i > 0$ are fixed parameters. The server pool sizes $H_s$ are scaled in the same way, namely,
$H_s = h_s r$, where $h_s > 0$, $s \in \mathcal S$, are fixed parameters.


Let $X^r(\cdot)$ be the process associated with a system with parameter $r$, and let $X^r(\infty)$ be the (random) system
state in the stationary regime. For each $i$, denote by $Y_i^r(t) = \sum_{k \in \mathcal K} k_i X_k^r(t)$ the total number of customers
of type $i$. As mentioned above, $Y_i^r(\infty)$ is stochastically dominated by a Poisson random variable with mean
$r\rho_i$, where $\rho_i = \lambda_i/\mu_i$. As before, without loss of generality, we assume $\sum_i \rho_i = 1$.

The fluid-scaled process is $x^r(t) = X^r(t)/r$, $t \in [0, \infty)$. We define $x^r(\infty) = X^r(\infty)/r$. Similarly, $y_i^r(t) =
Y_i^r(t)/r$, $x_{0^s}^r(t) = X_{0^s}^r(t)/r$ and $x_{(i)}^r(t) = X_{(i)}^r(t)/r$, for $t \ge 0$ and $t = \infty$.

For any $r$, $x^r(t)$ takes values in the compact set

$$\mathcal X^\Box = \Bigl\{x \in \mathbb R_+^{|\mathcal K|} \ \Big|\ \sum_{k \in \mathcal K^s} x_k \le h_s, \ \forall s \in \mathcal S\Bigr\}.$$

For any $x \in \mathcal X^\Box$, we denote $x_{0^s} = h_s - \sum_{k \in \mathcal K^s} x_k$, $s \in \mathcal S$, and will sometimes use notation $x = \{x_k, k \in \bar{\mathcal K}\}$.

The sequence of distributions of $x^r(\infty)$ is obviously tight, and therefore there always exists a limit $x(\infty)$ in
distribution, so that $x^r(\infty) \Rightarrow x(\infty)$, along a subsequence of $r$. The limit (random) vector $x(\infty)$ satisfies
the following property w.p.1:

$$\sum_{k \in \mathcal K} k_i x_k(\infty) = y_i(\infty) \le \rho_i, \quad \forall i. \qquad (16)$$

The asymptotic regime and property (16) obviously hold for any placement algorithm, not just GRAND-F.


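Consider the following subset of $\mathcal X^\Box$:

$$\mathcal X^{\boxdot} = \Bigl\{x \in \mathcal X^\Box \ \Big|\ \sum_s \sum_{k \in \mathcal K^s} k_i x_k = \rho_i, \ \forall i \in \mathcal I\Bigr\} = \mathcal X \cap \mathcal X^\Box.$$

We make the following

Assumption 7. The system parameters $\lambda_i$, $\mu_i$, $i \in \mathcal I$, and $h_s$, $s \in \mathcal S$, are such that the set $\mathcal X^{\boxdot}$ is non-empty.
Moreover, there exists $x \in \mathcal X^{\boxdot}$ such that $x_{0^s} > 0$ for all $s$.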


This assumption means that, when the scaling parameter $r$ is large, and we have $\rho_i r$ customers of each type
$i$, it is possible to “pack” all of them into the system servers ($h_s r$ for each type $s$), so that a non-zero fraction
of servers in each pool $s$ remains idle. Recall that, when $r$ is large, $\rho_i r$ is essentially the maximum number
of type $i$ customers the system can possibly have in steady state, because this would be the number of
customers in the infinite-server system with no blocking. Thus, the assumption guarantees that it is feasible,
at least in principle, to operate a system in a way such that, in the $r \to \infty$ limit, the steady-state blocking
probability vanishes.
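The assumption can be illustrated by checking whether a given candidate packing is a witness for it. The sketch below only verifies a supplied point; it does not search for one (that would be a linear feasibility problem). The dictionary layout and the function name are hypothetical choices made for this illustration.

```python
def witnesses_assumption_7(x, rho, h, tol=1e-9):
    """Check that a candidate fluid packing x certifies the subcritical-load assumption.

    x   -- dict mapping (s, k) to the fluid amount x_k of type-s servers in
           non-zero configuration k, where k is a tuple (k_1, ..., k_I)
           (hypothetical layout chosen for this sketch)
    rho -- list of loads rho_i, one per customer type
    h   -- dict mapping server type s to the scaled pool size h_s
    Returns True if sum_k k_i x_k = rho_i for all i, all x_k >= 0, and every
    pool keeps a strictly positive fraction of empty servers.
    """
    if any(mass < 0 for mass in x.values()):
        return False
    for i, rho_i in enumerate(rho):
        load = sum(k[i] * mass for (_, k), mass in x.items())
        if abs(load - rho_i) > tol:
            return False  # type-i customers are not all packed
    for s, h_s in h.items():
        used = sum(mass for (s2, _), mass in x.items() if s2 == s)
        if h_s - used <= tol:
            return False  # no idle servers left in pool s
    return True
```

For instance, with one customer type, $\rho_1 = 1$, and two pools of sizes 0.3 and 0.5, putting 0.2 of the first pool into configuration (1) and 0.4 of the second into configuration (2) packs all the load and leaves idle servers in both pools.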


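Consider the following function $L^\Box(x)$ defined on $x$ such that $x \in \mathcal X^\Box$ (and $x_{0^s} = h_s - \sum_{k \in \mathcal K^s} x_k$ for all $s$):

$$L^\Box(x) = \sum_{k \in \bar{\mathcal K}} x_k \log[x_k c_k / e], \qquad (17)$$

where $c_k = \prod_i k_i!$, $0! = 1$. We then have

$$(\partial/\partial x_k) L^\Box(x) = \log[x_k c_k], \quad k \in \bar{\mathcal K}. \qquad (18)$$

For each $k \in \bar{\mathcal K}$, the corresponding summand in the definition (17) of function $L^\Box(x)$ is strictly convex in $x_k$;
then, $L^\Box(x)$ is strictly convex on $\mathbb R_+^{|\bar{\mathcal K}|}$.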
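For completeness, the derivative formula and the strict convexity of each summand can be verified directly; the short computation below uses only the definition of the summand $x_k \log[x_k c_k/e]$ and assumes $x_k > 0$.

```latex
\frac{\partial}{\partial x_k}\Bigl[x_k \log\frac{x_k c_k}{e}\Bigr]
  = \log\frac{x_k c_k}{e} + x_k \cdot \frac{1}{x_k}
  = \log(x_k c_k) - 1 + 1
  = \log(x_k c_k),
\qquad
\frac{\partial^2}{\partial x_k^2}\Bigl[x_k \log\frac{x_k c_k}{e}\Bigr]
  = \frac{1}{x_k} > 0 .
```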


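Consider the problem $\min_{x \in \mathcal X^{\boxdot}} L^\Box(x)$. It is the following convex optimization problem:

$$\min_{x \in \mathbb R_+^{|\bar{\mathcal K}|}} L^\Box(x), \qquad (19)$$

subject to

$$\sum_{k \in \mathcal K} k_i x_k = \rho_i, \quad \forall i, \qquad (20)$$

$$\sum_{k \in \bar{\mathcal K}^s} x_k = h_s, \quad s \in \mathcal S. \qquad (21)$$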


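Denote by $x^{*,\Box}$ its unique optimal solution; of course, the corresponding $x^{*,\Box} \in \mathcal X^{\boxdot}$. Using (18) and
Assumption 7 it is easy to see that $x_k^{*,\Box} > 0$ for all $k \in \bar{\mathcal K}$. There exist a vector of Lagrange multipliers
$\nu^{*,\Box} = (\nu_i^{*,\Box}, i \in \mathcal I)$ for the constraints (20) and Lagrange multipliers $\beta_s^*$ for the constraints (21), such that
$x^{*,\Box}$ solves problem

$$\min_{x \in \mathbb R_+^{|\bar{\mathcal K}|}} L^\Box(x) + \sum_i \nu_i^{*,\Box}\Bigl(\rho_i - \sum_{k \in \mathcal K} k_i x_k\Bigr) + \sum_s \beta_s^*\Bigl(\sum_{k \in \bar{\mathcal K}^s} x_k - h_s\Bigr).$$

We see that $\log[x_k^{*,\Box} c_k] - \sum_i k_i \nu_i^{*,\Box} + \beta_s^* = 0$, $k \in \bar{\mathcal K}^s$. Therefore, $x^{*,\Box}$ has the product form

$$x_k^{*,\Box} = \frac{1}{c_k} \exp\Bigl[\sum_i k_i \nu_i^{*,\Box} - \beta_s^*\Bigr], \quad k \in \bar{\mathcal K}^s. \qquad (22)$$

This in particular implies that Lagrange multipliers $\nu_i^{*,\Box}$, $\beta_s^*$, are unique. They can have any sign (not
necessarily non-negative).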


We obtain the following fact. A point $x$, such that $x \in \mathcal X^{\boxdot}$, is the optimal solution to (19)-(21) (that is
$x = x^{*,\Box}$) if and only if it has a product form representation (22) for some Lagrange multipliers $\nu_i$, $\beta_s$.
Furthermore, $x^{*,\Box}$ and $\nu^{*,\Box}$ are equal to $x^{*,a}$ and $\nu^{*,a}$, respectively, defined above for the infinite-server
system with parameters $a_s = e^{-\beta_s^*}$.
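The product form (22) and the stationarity condition preceding it can be checked numerically. The sketch below is illustrative only; the function names are hypothetical, and the multiplier values in the usage example are arbitrary numbers rather than solutions of any particular instance of (19)-(21).

```python
import math


def c(k):
    """c_k = prod_i k_i!, with 0! = 1."""
    out = 1
    for k_i in k:
        out *= math.factorial(k_i)
    return out


def product_form(k, nu, beta_s):
    """x_k = (1/c_k) exp(sum_i k_i nu_i - beta_s), as in (22)."""
    exponent = sum(k_i * nu_i for k_i, nu_i in zip(k, nu)) - beta_s
    return math.exp(exponent) / c(k)


def stationarity_residual(x_k, k, nu, beta_s):
    """log(x_k c_k) - sum_i k_i nu_i + beta_s, which vanishes at a product-form point."""
    return (math.log(x_k * c(k))
            - sum(k_i * nu_i for k_i, nu_i in zip(k, nu))
            + beta_s)
```

Evaluating `product_form` at the zero configuration gives $e^{-\beta_s}$, which is the relation $a_s = e^{-\beta_s^*}$ linking the finite-server optimum to the infinite-server one.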


Our main result for the finite-server system is the following Proposition 8. (It is stated here informally.
Formal statements are given in Lemmas 15 and 16.)

Proposition 8. Suppose Assumption 7 holds. As $r \to \infty$, the limits of the fluid-scaled trajectories $x^r(\cdot)$
will be referred to as fluid sample paths (FSP). Point $x \in \mathcal X^\Box$ is an invariant point, if $x(t) \equiv x$ is an FSP.
Then $x^{*,\Box}$ is the unique invariant point $x$, such that $x_{0^s} > 0$ for all $s$ (and therefore there is no blocking).
Moreover, this invariant point is locally stable: $x(t) \to x^{*,\Box}$, uniformly for all FSPs with $x(0)$ sufficiently
close to $x^{*,\Box}$.


In turn, Proposition 8 strongly suggests that the following asymptotic optimality property holds, which we
present as

Conjecture 9. Suppose Assumption 7 holds. Consider a sequence of systems under the GRAND-F algorithm,
indexed by $r$, and let $x^r(\infty)$ denote the random state of the fluid-scaled process in the stationary
regime. Then, as $r \to \infty$, $x^r(\infty) \Rightarrow x^{*,\Box}$.

If Conjecture 9 is correct, the GRAND-F algorithm is asymptotically optimal in the following sense. As long
as Assumption 7 holds, i.e. the system has enough capacity to process all offered load (under ideal packing),
then as $r \to \infty$, the steady-state blocking probability under GRAND-F vanishes. As discussed in Remark 6,
GRAND-F can be viewed as an extension of the PULL algorithm [16]. Therefore, Conjecture 9, if correct, can
be viewed as an extension (to systems with packing constraints) of the asymptotic optimality of PULL.


3 Proof of Theorem 3


For any $k \in \mathcal K^s$, as $a_s \downarrow 0$,

$$[-\log a_s]^{-1} x_k \log[x_k c_k/(e a_s)] - x_k = [-\log a_s]^{-1} x_k [\log x_k + \log c_k - 1] \to 0,$$

uniformly on any compact subset of non-negative $x_k$. We have

$$L^{(a)}(x)/[-\log a_1] = \sum_s \frac{[-\log a_s]}{[-\log a_1]} \sum_{k \in \mathcal K^s} [-\log a_s]^{-1} x_k \log[x_k c_k/(e a_s)].$$

Setting $a_s = a^{\gamma_s}$ (which implies $[-\log a_s]/[-\log a_1] = \gamma_s/\gamma_1 = \gamma_s$), we see that, as $a \downarrow 0$,
$L^{(a)}(x)/[-\log a] - \sum_s \sum_{k \in \mathcal K^s} \gamma_s x_k \to 0$, uniformly in $x \in \mathcal X$. Therefore, $x^{*,a}$ must converge to $\mathcal X^*$.

Consider any sequence $a \downarrow 0$. We will denote $b = -\log a$. We will show that from any subsequence we can
choose a further subsequence, along which we have convergence $x^{*,a} \to x^*$, $\nu^{*,a}/b \to \eta^*$, where $x^* \in \mathcal X^*$
and $\eta^* \in \mathcal H^*$.

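Let a subsequence of $a$ be fixed. Since $x^{*,a} \to \mathcal X^*$, we can and do choose a further subsequence along which
$x^{*,a} \to x^*$ for some fixed $x^* \in \mathcal X^*$. Let us show that

$$\limsup_{a \to 0} \sum_i k_i \nu_i^{*,a}/b \le \gamma_s, \quad \forall k \in \mathcal K^s, \qquad (23)$$

$$\liminf_{a \to 0} \nu_i^{*,a}/b \ge 0, \quad \forall i. \qquad (24)$$

From the product form of $x^{*,a}$ we have:

$$x_k^{*,a} = \frac{1}{c_k} \exp\Bigl[b\Bigl(\sum_i k_i \nu_i^{*,a}/b - \gamma_s\Bigr)\Bigr], \quad k \in \mathcal K^s. \qquad (25)$$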


If (23) did not hold for some $k \in \mathcal K^s$, then by (25) we would have $\limsup x_k^{*,a} = \infty$, a contradiction.
Thus, (23) holds. Suppose now that (24) does not hold for some $i$, that is $\liminf \nu_i^{*,a}/b < 0$. Pick an $s$ and
$k \in \mathcal K^s$ such that $k_i \ge 1$ and $x_k^* > 0$. Such $s$ and $k$ must exist, because $\sum_k k_i x_k^* = \rho_i$ (recall that $x^* \in \mathcal X^*$).
Since $x_k^* \in (0, \rho_i]$, we see from (25) that $\lim \sum_j k_j \nu_j^{*,a}/b = \gamma_s$. Therefore,

$$\limsup \Bigl[\sum_{j \ne i} k_j \nu_j^{*,a}/b + (k_i - 1)\nu_i^{*,a}/b\Bigr] = \gamma_s - \liminf \nu_i^{*,a}/b > \gamma_s;$$

but this violates (23) for configuration $k - e_i$. Thus, (24) holds.

By (23)-(24), the sequence of $\nu^{*,a}/b$ is bounded. Then, we choose a further subsequence along which $\nu^{*,a}/b$
converges to some $\eta^*$. For the pair $x^*$ and $\eta^*$, condition (6) is automatic, the dual feasibility conditions follow
from (23)-(24), and the complementary slackness condition follows from (25). Therefore, $\eta^* \in \mathcal H^*$. □


4 Fluid sample paths for the infinite-server system
under GRAND(aZ). Proof of Theorem 2


In this section, we define fluid sample paths (FSP) for the system controlled by GRAND(aZ). FSPs arise as
limits of the (fluid-scaled) trajectories $(1/r)X^r(\cdot)$ as $r \to \infty$. Then we prove Theorem 2. The development
in this section is a generalization to the heterogeneous system of the definitions and results given for the
homogeneous system in Section 4 of [15]. The generalization is quite straightforward. However, we provide
it here for completeness and, more importantly, as a preparation for the related argument used later in
Section 5 for the finite-server system.

Let $\mathcal M$ denote the set of pairs $(k, i)$ such that $k \in \mathcal K$ and $k - e_i \in \bar{\mathcal K}$. Each pair $(k, i)$ is associated with the
“edge” $(k - e_i, k)$ connecting configurations $k - e_i$ and $k$; often we refer to this edge as $(k, i)$. By “arrival
along the edge $(k, i)$”, we will mean placement of a type $i$ customer into a server in configuration $k - e_i$ to
form configuration $k$. Similarly, “departure along the edge $(k, i)$” is a departure of a type-$i$ customer from
a server in configuration $k$, which changes its configuration to $k - e_i$.
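As an illustration of the edge set, the sketch below enumerates the pairs $(k, i)$ for a single server type, given its non-zero feasible configurations as tuples. The function name and the input format are hypothetical; the zero configuration is added automatically, and the input is assumed to be non-empty.

```python
def edges(nonzero_configs):
    """List the pairs (k, i) with k a non-zero configuration and k - e_i feasible.

    nonzero_configs -- non-empty iterable of tuples (k_1, ..., k_I) describing
                       the non-zero configurations of one server type
    """
    configs = list(nonzero_configs)
    feasible = set(configs)
    feasible.add(tuple(0 for _ in configs[0]))  # the zero configuration
    result = []
    for k in sorted(configs):
        for i, k_i in enumerate(k):
            if k_i == 0:
                continue  # no type-i customer to remove from k
            prev = k[:i] + (k_i - 1,) + k[i + 1:]  # configuration k - e_i
            if prev in feasible:
                result.append((k, i))
    return result
```

For two customer types and configurations (1,0), (0,1), (1,1), (2,0), this yields five edges; for example, ((1,1), 0) is the edge from (0,1) to (1,1) along which a type-0 customer arrives or departs.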



Without loss of generality, assume that the Markov process $X^r(\cdot)$ for each $r$ is driven by the common set of
primitive processes, defined as follows.

For each $(k, i) \in \mathcal M$, consider an independent unit-rate Poisson process $\{\Pi_{ki}(t), t \ge 0\}$, which drives
departures along edge $(k, i)$. Namely, let $D_{ki}^r(t)$ denote the total number of departures along the edge $(k, i)$
in $[0, t]$; then

$$D_{ki}^r(t) = \Pi_{ki}\Bigl(\int_0^t k_i \mu_i X_k^r(s)\, ds\Bigr). \qquad (26)$$

The functional strong law of large numbers (FSLLN) holds:

$$\frac{1}{r}\Pi_{ki}(rt) \to t, \quad \text{u.o.c., w.p.1.} \qquad (27)$$


For each $i \in \mathcal I$, consider an independent unit-rate Poisson process $\Pi_i(t)$, $t \ge 0$, which drives exogenous
arrivals of type $i$. Namely, let $A_i^r(t)$ denote the total number of type-$i$ arrivals in $[0, t]$; then

$$A_i^r(t) = \Pi_i(\lambda_i r t). \qquad (28)$$

Analogously to (27),

$$\frac{1}{r}\Pi_i(\lambda_i r t) \to \lambda_i t, \quad \text{u.o.c., w.p.1.} \qquad (29)$$


The random placement of new arrivals is constructed as follows. For each $i \in \mathcal I$, consider an i.i.d. sequence
$\xi_i(1), \xi_i(2), \ldots$ of random variables, uniformly distributed in $[0, 1]$. Denote by $\bar{\mathcal K}_i = \{k \in \bar{\mathcal K} \mid k + e_i \in \mathcal K\}$ the
subset of those configurations (including zero configurations) which can fit an additional type-$i$ customer.
The configurations $k \in \bar{\mathcal K}_i$ are indexed by $1, 2, \ldots, |\bar{\mathcal K}_i|$ (in arbitrary fixed order). When the $m$-th (in time)
customer of type $i$ arrives in the system, it is assigned as follows. If $X_{(i)}^r = 0$, the customer is assigned to
an empty server of an arbitrarily fixed type $s$, such that $e_i \in \mathcal K^s$. Suppose $X_{(i)}^r \ge 1$. Then, the customer is
assigned to a server in configuration $k'$ indexed by 1 if

$$\xi_i(m) \in [0, X_{k'}^r/X_{(i)}^r],$$

it is assigned to a server in configuration $k''$ indexed by 2 if

$$\xi_i(m) \in (X_{k'}^r/X_{(i)}^r, \ (X_{k'}^r + X_{k''}^r)/X_{(i)}^r],$$

and so on. Denote

$$g_i^r(\sigma, \zeta) = \frac{1}{r} \sum_{m=1}^{\lfloor r\sigma \rfloor} I\{\xi_i(m) \le \zeta\},$$

where $\sigma \ge 0$, $0 \le \zeta \le 1$. Obviously, from the strong law of large numbers and the monotonicity of $g_i^r(\sigma, \zeta)$
in both arguments, we have the FSLLN

$$g_i^r(\sigma, \zeta) \to \sigma\zeta, \quad \text{u.o.c., w.p.1.} \qquad (30)$$

It is easy (and standard) to see that, for any $r$, w.p.1, the realization of the process $\{X^r(t), t \ge 0\}$ is
uniquely determined by the initial state $X^r(0)$ and the realizations of the driving processes $\Pi_{ki}(\cdot)$, $\Pi_i(\cdot)$ and
$(\xi_i(1), \xi_i(2), \ldots)$.
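The interval construction above is an inverse-transform selection: a single uniform variable picks a configuration with probability proportional to the number of servers currently in it. A minimal sketch follows; the function name and the list-of-pairs input are hypothetical, and the comparison is done on the scale of server counts to avoid a division.

```python
def pick_configuration(xi, counts):
    """Select the configuration whose interval contains the uniform variable xi.

    xi     -- value in [0, 1], playing the role of xi_i(m)
    counts -- list of (configuration, X_k) pairs in the fixed indexing order,
              restricted to configurations that can fit a type-i customer
    Returns the selected configuration, or None if no server is available.
    """
    if not 0.0 <= xi <= 1.0:
        raise ValueError("xi must lie in [0, 1]")
    total = sum(n for _, n in counts)  # X_(i)
    if total == 0:
        return None
    threshold = xi * total
    cumulative = 0
    for config, n in counts:
        cumulative += n
        # Interval of this configuration ends at cumulative / total;
        # configurations with no servers have empty intervals and are skipped.
        if n > 0 and threshold <= cumulative:
            return config
    return None  # not reached for xi in [0, 1]
```

With counts 2, 0 and 6 for three configurations, the first configuration is selected when $\xi \le 2/8$, the second never, and the third otherwise.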

If we denote by $A_{ki}^r(t)$ the total number of arrivals allocated along edge $(k, i)$ in $[0, t]$, we obviously have
$\sum_{k:(k,i) \in \mathcal M} A_{ki}^r(t) = A_i^r(t)$, $t \ge 0$, for each $i$.

In addition to

$$x_k^r(t) = \frac{1}{r} X_k^r(t),$$

we introduce other fluid-scaled quantities:

$$d_{ki}^r(t) = \frac{1}{r} D_{ki}^r(t), \quad a_{ki}^r(t) = \frac{1}{r} A_{ki}^r(t).$$

A set of locally Lipschitz continuous functions $[\{x_k(\cdot), k \in \mathcal K\}, \{d_{ki}(\cdot), (k, i) \in \mathcal M\}, \{a_{ki}(\cdot), (k, i) \in \mathcal M\}]$ on
the time interval $[0, \infty)$ we call a fluid sample path (FSP), if there exist realizations of the primitive driving
processes, satisfying conditions (27), (29) and (30), and a fixed subsequence of $r$, along which

$$[\{x_k^r(\cdot), k \in \mathcal K\}, \{d_{ki}^r(\cdot), (k, i) \in \mathcal M\}, \{a_{ki}^r(\cdot), (k, i) \in \mathcal M\}] \to
[\{x_k(\cdot), k \in \mathcal K\}, \{d_{ki}(\cdot), (k, i) \in \mathcal M\}, \{a_{ki}(\cdot), (k, i) \in \mathcal M\}], \quad \text{u.o.c.} \qquad (31)$$

For any FSP, all points $t \ge 0$ are regular (as defined earlier), except a subset of zero Lebesgue
measure.

Lemma 10. Consider a sequence of fluid-scaled processes $\{x^r(t), t \ge 0\}$ with fixed initial states $x^r(0)$ such
that $x^r(0) \to x(0)$. Then w.p.1, for any subsequence of $r$ there exists a further subsequence of $r$, along which
the convergence (31) holds, with the limit being an FSP.


Proof is the same as that of Lemma 5 in [15]. □


For an FSP, at a regular time point $t$, we denote $v_{ki}(t) = (d/dt)a_{ki}(t)$ and $w_{ki}(t) = (d/dt)d_{ki}(t)$. In other
words, $v_{ki}(t)$ and $w_{ki}(t)$ are the rates of type-$i$ “fluid” arrival and departure along edge $(k, i)$, respectively.
Also denote: $y_i(t) = \sum_k k_i x_k(t)$, $z(t) = \sum_i y_i(t)$, $x_{0^s}(t) = a_s z(t)$, and $x_{(i)}(t) = \sum_{k \in \bar{\mathcal K}:\ k + e_i \in \mathcal K} x_k(t)$.

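Lemma 11. (i) An FSP satisfies the following properties at any regular point $t$:

$$(d/dt)y_i(t) = \lambda_i - \mu_i y_i(t), \quad \forall i \in \mathcal I, \qquad (32)$$

$$w_{ki}(t) = k_i \mu_i x_k(t), \quad \forall (k, i) \in \mathcal M, \qquad (33)$$

$$x_{(i)}(t) > 0 \ \text{implies} \ v_{ki}(t) = \frac{x_{k - e_i}(t)}{x_{(i)}(t)} \lambda_i, \quad \forall (k, i) \in \mathcal M, \qquad (34)$$

$$\sum_{k:(k,i) \in \mathcal M} v_{ki}(t) = \lambda_i, \quad \forall i \in \mathcal I, \qquad (35)$$

$$(d/dt)x_k(t) = \Bigl[\sum_{i:\, k - e_i \in \bar{\mathcal K}} v_{ki}(t) - \sum_{i:\, k + e_i \in \mathcal K} v_{k + e_i, i}(t)\Bigr] - \Bigl[\sum_{i:\, k - e_i \in \bar{\mathcal K}} w_{ki}(t) - \sum_{i:\, k + e_i \in \mathcal K} w_{k + e_i, i}(t)\Bigr], \quad \forall k \in \mathcal K. \qquad (36)$$

Clearly, (32) implies

$$y_i(t) = \rho_i + (y_i(0) - \rho_i)e^{-\mu_i t}, \quad t \ge 0, \ \forall i \in \mathcal I. \qquad (37)$$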

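(ii) Moreover, an FSP with $x(0) \in \mathcal X$ satisfies the following stronger conditions:

$$y_i(t) \equiv \rho_i, \quad \forall i \in \mathcal I, \qquad (38)$$

$$z(t) \equiv 1, \quad x_{0^s}(t) \equiv a_s, \quad x_{(i)}(t) \ge \sum_{s:\, e_i \in \mathcal K^s} a_s, \ \forall i \in \mathcal I; \qquad (39)$$

at any regular point $t$,

$$v_{ki}(t) = \frac{x_{k - e_i}(t)}{x_{(i)}(t)} \lambda_i, \quad \forall (k, i) \in \mathcal M, \qquad (40)$$

$$\sum_{k:(k,i) \in \mathcal M} w_{ki}(t) = \lambda_i, \quad \forall i \in \mathcal I. \qquad (41)$$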

Proof. (i) Given the convergence (31), which defines an FSP, all the stated properties except (34) are
nothing but the limit versions of the flow conservation laws. Property (34) follows from the construction of
the random assignment, the continuity of $x(t)$, and (30). We omit further details.

(ii) If $x(0) \in \mathcal X$, which implies $y_i(0) = \rho_i$ for each $i$, property (38) (and then (39) as well) follows from (37).
Then, (34) strengthens to (40), and (41) is verified directly using (33). □

Lemma 12. For any FSP with $x(0) \in \mathcal X$,

$$x(t) \to x^{*,a}, \qquad (42)$$

and the convergence is uniform across all such FSPs.


Proof. Given that $x_{0^s}(t) = a_s$ and $\sum_{k \in \mathcal K} x_k(t) \le 1$, we have $x_{(i)}(t) \le 1 + \sum_s a_s$, hence
$v_{ki}(t) \ge x_{k - e_i}(t)\lambda_i/(1 + \sum_s a_s)$.

From here, we obtain the following fact: for any $k$ and any $\delta > 0$ there exists $\delta_1 > 0$ such that for all
$t \ge \delta$, $x_k(t) \ge \delta_1$. The proof is by contradiction. Consider a $k$, say $k \in \mathcal K^s$, that is a minimal counterexample;
necessarily, $k \ne 0^s$. Pick any $\delta > 0$ and then the corresponding $\delta_1 > 0$ such that the statement holds for
any $k' \in \bar{\mathcal K}^s$, $k' < k$. (Here $k' < k$ means that $k'_i \le k_i$, $\forall i$, and $k' \ne k$.) We observe from (36) that for any
regular $t \ge \delta$, $(d/dt)x_k(t) \ge \delta_2 > 0$ as long as $x_k(t) \le \delta_3$, for some positive constants $\delta_2$, $\delta_3$. Since this holds
for an arbitrarily small $\delta > 0$ (with $\delta_1, \delta_2, \delta_3$ depending on it), we see that the statement is true for $k$.

In particular, we see that $x_k(t) > 0$ for all $t > 0$ and all $k$. Note also that all $t > 0$ are regular points
(because all $w_{ki}$ and $v_{ki}$ are bounded continuous in $x$).


To prove the lemma, it will suffice to show that:

(a) if $x(t) \ne x^{*,a}$ and $x_k(t) > 0$ for all $k \in \mathcal K$, then $(d/dt)L^{(a)}(x(t)) < 0$; and, moreover,

(b) the derivative is bounded away from zero as long as $\|x(t) - x^{*,a}\|$ is bounded away from zero.

Let us denote by $\Xi(x)$ the derivative $(d/dt)L^{(a)}(x(t))$ at a given point $x(t) = x$; in the rest of the proof we
study the function $\Xi(x)$ on $\mathcal X$, and therefore drop the time index $t$. Suppose all components $x_k > 0$. From
(33), (35), (40), and (41), we have:

$$w_{ki} = k_i \mu_i x_k = k_i \mu_i x_k \sum_{k':(k',i) \in \mathcal M} \frac{x_{k' - e_i}}{x_{(i)}}, \qquad (43)$$

$$v_{k'i} = \frac{x_{k' - e_i}}{x_{(i)}} \lambda_i = \frac{x_{k' - e_i}}{x_{(i)}} \sum_{k:(k,i) \in \mathcal M} k_i \mu_i x_k. \qquad (44)$$


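Expressions (43) and (44) can be interpreted as follows. For any ordered pair of edges $(k, i)$ and $(k', i)$, we
can assume that the part $k_i \mu_i x_k x_{k' - e_i}/x_{(i)}$ of the total departure rate $k_i \mu_i x_k$ along $(k, i)$ is “allocated back”
as a part of the arrival rate along $(k', i)$. Using the expression for the partial derivatives of $L^{(a)}$, the
contribution of these “coupled” departure/arrival rates for the ordered pair of edges $(k, i)$ and $(k', i)$ into the
derivative $\Xi(x)$ is

$$\xi_{k,k',i} = [\log(k'_i x_{k - e_i} x_{k'}) - \log(k_i x_k x_{k' - e_i})] \frac{k_i \mu_i x_k x_{k' - e_i}}{x_{(i)}}.$$

This expression is valid even when either $k - e_i = 0^s$ or $k' - e_i = 0^s$ for some $s$. This is because $x_{0^s}(t) = a_s$
when $x \in \mathcal X$, and therefore, by the adopted convention, the formula for the partial derivatives of $L^{(a)}$ is valid
for all $k \in \bar{\mathcal K}$. We have:

$$\xi_{k,k',i} + \xi_{k',k,i} = (\mu_i/x_{(i)})[\log(k'_i x_{k - e_i} x_{k'}) - \log(k_i x_k x_{k' - e_i})][k_i x_k x_{k' - e_i} - k'_i x_{k - e_i} x_{k'}] \le 0,$$

and the inequality is strict unless $k'_i x_{k - e_i} x_{k'} = k_i x_k x_{k' - e_i}$. We obtain

$$\Xi(x) = \sum_i \sum_{k,k'} [\xi_{k,k',i} + \xi_{k',k,i}], \qquad (45)$$

where the inner sum is over unordered pairs of edges $(k, i)$ and $(k', i)$.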


Therefore, $\Xi(x) < 0$ unless $x$ has a product form representation, which in turn is equivalent to $x = x^{*,a}$.

So far the function $\Xi(x)$ in (45) was defined for $x \in \mathcal X$ with all $x_k > 0$. Let us adopt a convention that
$\Xi(x) = -\infty$ for $x \in \mathcal X$ with at least one $x_k = 0$. Then, it is easy to verify that $\Xi(x)$ is continuous on the
entire set $\mathcal X$.


It remains to show that for any $\delta_2 > 0$ there exists $\delta_3 > 0$ such that conditions $x \in \mathcal X$ and $L^{(a)}(x) -
L^{(a)}(x^{*,a}) \ge \delta_2$ imply $\Xi(x) \le -\delta_3$. This is indeed true, because otherwise there would exist $x \in \mathcal X$,
$x \ne x^{*,a}$, such that $\Xi(x) = 0$, which is, again, equivalent to $x = x^{*,a}$. □
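The convergence asserted by the lemma can be observed numerically on a toy instance. The sketch below is an illustration under assumptions made here, not part of the proof: one customer type, one server type with configurations $k = 0, 1, 2$, $\lambda = \mu = 1$ (so $\rho = 1$), $a = 0.5$, and a plain Euler scheme for the FSP equations (33), (36), (40). The product-form point of this toy system is $x_k = (a/c_k)u^k$ with $u$ solving $au + au^2 = 1$.

```python
import math

LAM = 1.0  # arrival rate; with MU = 1 this gives rho = 1 (the normalization used in the text)
MU = 1.0   # service rate


def simulate_fluid(a=0.5, dt=0.01, horizon=50.0):
    """Euler integration of the toy FSP, started with every customer alone in a server."""
    x1, x2 = 1.0, 0.0
    for _ in range(int(round(horizon / dt))):
        x0 = a * (x1 + 2.0 * x2)   # zero-servers: x_0 = a z
        x_avail = x0 + x1          # x_(1): servers that can take one more customer
        v1 = LAM * x0 / x_avail    # arrival rate along edge (1, 1), as in (40)
        v2 = LAM * x1 / x_avail    # arrival rate along edge (2, 1)
        w1 = 1 * MU * x1           # departure rate along edge (1, 1), as in (33)
        w2 = 2 * MU * x2           # departure rate along edge (2, 1)
        x1, x2 = (x1 + dt * ((v1 - v2) - (w1 - w2)),  # (36) for k = 1
                  x2 + dt * (v2 - w2))                # (36) for k = 2
    return x1, x2


def product_form_point(a=0.5):
    """Product-form point x_k = (a / c_k) u^k with a u + a u^2 = 1."""
    u = (-1.0 + math.sqrt(1.0 + 4.0 / a)) / 2.0
    return a * u, a * u * u / 2.0


def lyapunov(x1, x2, a=0.5):
    """L^(a)(x) = sum_k x_k log[x_k c_k / (e a)], with 0 log 0 = 0."""
    def term(x, c_k):
        return 0.0 if x == 0.0 else x * math.log(x * c_k / (math.e * a))
    return term(x1, 1) + term(x2, 2)
```

With $a = 0.5$ the product-form point is $(x_1, x_2) = (0.5, 0.25)$; the simulated trajectory approaches it, and the value of `lyapunov` at the end of the run is below its value at the initial state.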

From Lemma 12 we easily obtain Theorem 2; see the proof of Theorem 3 in Section 4 of [15].

As in [15], we also have the following generalization of Lemma 12, showing FSP uniform convergence for
arbitrary initial states, not necessarily $x(0) \in \mathcal X$.

Lemma 13. For any compact $A \subset \mathbb R_+^{|\mathcal K|}$, the convergence

$$x(t) \to x^{*,a} \qquad (46)$$

holds uniformly in all FSPs with $x(0) \in A$.

Proof repeats that of Lemma 8 in [15] almost verbatim. The only adjustments are:

1) Starting any fixed time t > 0, we have 0 < oi < a:o'>(t), Vs, and a:(i)(t) < 02 < 00, Vi, for some constants 
01 , 02 , uniformly on all FSPs with a;(0) G A; 

2) replaces L^); 

3) f{k) = {d/dxk)L^°'Hx) = log[a;fcCfc/as], fe G /CL □ 


5 GRAND-F: Local stability of FSPs 


The construction of the Markov process $X^r(\cdot)$ under GRAND-F is the same as in Section 4 for GRAND(aZ),
except now, when $X_{(i)}^r = 0$, an arriving type $i$ customer is blocked. Consequently, we no longer have the
identity $\sum_{k:(k,i) \in \mathcal M} A_{ki}^r(t) = A_i^r(t)$, $t \ge 0$, for each $i$. Instead,

$$A_i^r(t) - \sum_{k:(k,i) \in \mathcal M} A_{ki}^r(t), \quad t \ge 0,$$

is a non-negative non-decreasing function, giving the number of blocked type $i$ customers by time $t$.


The definition of an FSP and Lemma 10 hold as is. All points $t \ge 0$ are regular, except for a subset of zero
Lebesgue measure. The analog of Lemma 11 is the following

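Lemma 14. (i) An FSP satisfies the following properties at any regular point $t$:

$$\sum_{k:(k,i) \in \mathcal M} v_{ki}(t) \le \lambda_i, \quad \forall i \in \mathcal I, \qquad (47)$$

$$(d/dt)y_i(t) = \sum_{k:(k,i) \in \mathcal M} v_{ki}(t) - \mu_i y_i(t), \quad \forall i \in \mathcal I, \qquad (48)$$

$$w_{ki}(t) = k_i \mu_i x_k(t), \quad \forall (k, i) \in \mathcal M, \qquad (49)$$

$$x_{(i)}(t) > 0 \ \text{implies} \ \sum_{k:(k,i) \in \mathcal M} v_{ki}(t) = \lambda_i, \ \forall i \in \mathcal I, \ \text{and} \ v_{ki}(t) = \frac{x_{k - e_i}(t)}{x_{(i)}(t)} \lambda_i, \ \forall (k, i) \in \mathcal M, \qquad (50)$$

$$(d/dt)x_k(t) = \Bigl[\sum_{i:\, k - e_i \in \bar{\mathcal K}} v_{ki}(t) - \sum_{i:\, k + e_i \in \mathcal K} v_{k + e_i, i}(t)\Bigr] - \Bigl[\sum_{i:\, k - e_i \in \bar{\mathcal K}} w_{ki}(t) - \sum_{i:\, k + e_i \in \mathcal K} w_{k + e_i, i}(t)\Bigr], \quad \forall k \in \bar{\mathcal K}. \qquad (51)$$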

(ii) Moreover, an FSP with a;(0) G X^, X(i^(0) > 0, Vi, satisfies the following stronger conditions for all 
sufficiently small t > 0: 


if t is regular, 


yi{t) = Pi, \fiGl, 

(52) 

z(t) = 1, XQsit) = Os, Vs, X(i)(t) > minos, Vi G I; 

(53) 

Vki{t) = y{k,i)GM, 

(54) 

k:{k,i)£Ai 

(55)

Proof. (i) Given the convergence (31) defining an FSP, all the stated properties except (50) are nothing but
the limit versions of the flow conservation laws. Property (50) follows from the construction of the random
assignment, the continuity of $x(t)$, and (30). We omit further details.

(ii) If $x(0) \in \mathcal X^{\boxdot}$, which implies $y_i(0) = \rho_i$ for each $i$, property (52) (and then (53) as well) follows from (48)
and (50). Then, (55) is verified directly using (49). Finally, (54) follows from (50). □

Lemma 15. There exists $\epsilon > 0$, such that, uniformly on FSPs with initial states $x(0) \in \mathcal X^{\boxdot} \cap \{\|x - x^{*,\Box}\| \le \epsilon\}$,

$$x(t) \to x^{*,\Box}, \quad t \to \infty. \qquad (56)$$

FSP $x(t) \equiv x^{*,\Box}$ is the unique invariant FSP, satisfying conditions $x_{0^s}(0) > 0$, $\forall s$.

Proof. We can assume (without loss of generality) that $\epsilon$ is small enough so that $x_k(0) > 0$, $\forall k \in \bar{\mathcal K}$. In
particular, at $t = 0$, the condition $x_{0^s}(t) > 0$, $\forall s$, holds. Obviously, until the first time $\tau > 0$ when this
condition is violated ($\tau = \infty$ if it is never violated), we have $y_i(t) = \rho_i$, $\forall i$. It is also easy to see that all time
points $0 < t < \tau$ are regular and such that $x_k(t) > 0$, $\forall k \in \bar{\mathcal K}$. Denote by $\Xi(x)$ the derivative $(d/dt)L^\Box(x(t))$
at a given point $x(t) = x$. Then, expressions (43) and (44) for $w_{ki}$ and $v_{k'i}$ hold for our system, and can be
interpreted the same way. (Recall, however, that now the components $x_{0^s}$ are not constant, and therefore
their derivatives do depend on the rates $w_{0^s + e_i, i}$ and $v_{0^s + e_i, i}$.) Then the expression for $\Xi(x)$ has exactly
the same form as expression (45) for $\Xi(x)$ in Section 4:

$$\Xi(x) = \sum_i \sum_{k,k'} (\mu_i/x_{(i)})[\log(k'_i x_{k - e_i} x_{k'}) - \log(k_i x_k x_{k' - e_i})][k_i x_k x_{k' - e_i} - k'_i x_{k - e_i} x_{k'}] \le 0. \qquad (57)$$

The inequality in (57) is strict unless $k'_i x_{k - e_i} x_{k'} = k_i x_k x_{k' - e_i}$ for all pairs of edges $(k, i)$ and $(k', i)$.
Therefore, $\Xi(x) < 0$ unless $x$ has a product form representation (22), which in turn is equivalent to $x = x^{*,\Box}$.

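Function $\Xi(x)$ is continuous in a neighborhood of $x^{*,\Box}$ (and in fact at any point such that $x_k > 0$, $\forall k \in \bar{\mathcal K}$).
Choose $\epsilon_1 > 0$ small enough so that $x_k > 0$, $k \in \bar{\mathcal K}$, for all $x \in \mathcal X^{\boxdot} \cap \{\|x - x^{*,\Box}\| \le \epsilon_1\}$. Then choose
$\delta > 0$ such that condition $L^\Box(x) - L^\Box(x^{*,\Box}) \le \delta$ (along with $x \in \mathcal X^{\boxdot}$) implies $\|x - x^{*,\Box}\| \le \epsilon_1$. Finally,
choose $\epsilon > 0$ small enough so that the maximum of $L^\Box(x) - L^\Box(x^{*,\Box})$ over the set $\mathcal X^{\boxdot} \cap \{\|x - x^{*,\Box}\| \le \epsilon\}$
is less than $\delta$. We see that a trajectory with $x(0) \in \mathcal X^{\boxdot} \cap \{\|x - x^{*,\Box}\| \le \epsilon\}$ cannot escape from the set
$\mathcal X^{\boxdot} \cap \{\|x - x^{*,\Box}\| \le \epsilon_1\}$, and therefore $x_k(t) > 0$, $k \in \bar{\mathcal K}$, for all $t \ge 0$. Then, the convergence (56) holds, and
it is uniform on $x(0) \in \mathcal X^{\boxdot} \cap \{\|x - x^{*,\Box}\| \le \epsilon\}$, because, for any $0 < \delta_1 < \delta$, $\Xi(x)$ is negative and bounded
away from zero for all $x \in \mathcal X^{\boxdot} \cap \{\delta_1 \le L^\Box(x) - L^\Box(x^{*,\Box}) \le \delta\}$.

It is a corollary from the above argument, that there cannot be an invariant FSP $x(t) \equiv x(0)$ with $x_{0^s}(0) >
0$, $\forall s$, unless $x(0) = x^{*,\Box}$. (Indeed, $x(0) \in \mathcal X^{\boxdot}$ necessarily, because if $y_i(0) \ne \rho_i$ then $y_i(t)$ cannot be constant.
Then $x(0) = x^{*,\Box}$, because otherwise $L^\Box(x(t))$ cannot be constant.) This proves the second statement of
the lemma. □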

Lemma 16. There exists $\epsilon > 0$, such that, uniformly on FSPs with initial states $x(0) \in \mathcal X^\Box \cap \{\|x - x^{*,\Box}\| \le \epsilon\}$,

$$x(t) \to x^{*,\Box}, \quad t \to \infty. \qquad (58)$$

Proof is a slightly generalized version of that of Lemma 15. That proof considers FSPs that stay within $\mathcal X^{\boxdot}$,
uses the continuity of $\Xi(x)$, and the fact that for $x \in \mathcal X^{\boxdot}$ in a small neighborhood of $x^{*,\Box}$, $\Xi(x) < 0$ unless
$x = x^{*,\Box}$. But, $\Xi(x)$ is continuous in a neighborhood of $x^{*,\Box}$ (or any point such that $x_k > 0$, $\forall k \in \bar{\mathcal K}$), not
necessarily restricted to $\mathcal X^{\boxdot}$. In addition, we know that as long as $x_{0^s}(t) > 0$, $\forall s$, each $y_i(t)$ satisfies the ODE
$(d/dt)[y_i(t) - \rho_i] = -\mu_i[y_i(t) - \rho_i]$, and therefore

$$y_i(t) - \rho_i = (y_i(0) - \rho_i)e^{-\mu_i t}. \qquad (59)$$


Using these observations, the adjustment of the proof of Lemma 15 is as follows. We choose small $\epsilon_1 > 0$,
then $\delta > 0$, then $\epsilon > 0$, exactly as in that proof. Then, using the continuity of $\Xi(x)$, along with (59), we can
choose a sufficiently small $\epsilon_2 > 0$, so that a trajectory with $x(0) \in \{|y_i - \rho_i| \le \epsilon_2, \forall i\} \cap \{\|x - x^{*,\Box}\| \le \epsilon\}$
cannot escape from the set $\{|y_i - \rho_i| \le \epsilon_2, \forall i\} \cap \{\|x - x^{*,\Box}\| \le \epsilon_1\}$. Then, the convergence (58) holds, and
it is uniform on $x(0) \in \{|y_i - \rho_i| \le \epsilon_2, \forall i\} \cap \{\|x - x^{*,\Box}\| \le \epsilon\}$, because, for any $0 < \delta_1 < \delta$, there exists a
small $\epsilon_3 > 0$, such that $\Xi(x)$ is negative and bounded away from zero for all $x \in \{|y_i - \rho_i| \le \epsilon_3, \forall i\} \cap \{\delta_1 \le
L^\Box(x) - L^\Box(x^{*,\Box}) \le \delta\}$. (Note that the time for FSPs starting in $\{|y_i - \rho_i| \le \epsilon_2, \forall i\} \cap \{\|x - x^{*,\Box}\| \le \epsilon\}$ to
reach set $\{|y_i - \rho_i| \le \epsilon_3, \forall i\} \cap \{\|x - x^{*,\Box}\| \le \epsilon\}$ is uniformly bounded due to (59).) □

5.1 Comments on Conjecture 9, local stability, and fixed point argument

Lemmas 15 and 16 formally state properties described informally in Proposition 8. The sequence of steady-states
$x^r(\infty)$ is obviously tight. It is easy to see that any of its subsequential limits in distribution, $x(\infty)$, is
such that $y_i(\infty) \le \rho_i$, $\forall i$, w.p.1. This is because, by comparison with the infinite-server system, $Y_i^r(\infty)$ is
stochastically dominated by a Poisson random variable with mean $r\rho_i$. Furthermore, again by comparison
with the infinite-server system, any FSP with

$$x(0) \in \mathcal X^{\Box,\le} = \Bigl\{x \in \mathcal X^\Box \ \Big|\ \sum_s \sum_{k \in \mathcal K^s} k_i x_k \le \rho_i, \ \forall i \in \mathcal I\Bigr\}$$

stays in $\mathcal X^{\Box,\le}$ at all times $t$. Given these facts, if we would have the (analogous to Lemma 12) uniform
convergence property

$$x(t) \to x^{*,\Box}, \quad \forall x(0) \in \mathcal X^{\Box,\le}, \qquad (60)$$

this would prove Conjecture 9 (by the same argument as in the proof of Theorem 2). Unfortunately, the
uniform convergence (60) does not hold for a general finite-server system. It is very easy to construct a
counterexample (e.g., for a system with one server type with the configuration set shown in Fig. 1(b) in [15])
such that there exists an invariant FSP $x(t) \equiv x^*$, “sitting” at a suboptimal point $x^* \ne x^{*,\Box}$, such that
$y_i^* < \rho_i$, $\forall i$, and therefore such that there is a non-zero fraction of customers of each type being blocked. (In
fact, we believe that a stronger property holds for such a counterexample: the sequence of processes $x^r(\cdot)$
converges in distribution to the invariant FSP $x(t) \equiv x^*$.) This, of course, does not imply that Conjecture 9
is wrong - it just shows that there is no hope of proving Conjecture 9 based on fluid scale considerations
alone.

Lemmas 15 and 16 show FSP local stability at the optimal point $x^{*,\Box}$, and the fact that $x^{*,\Box}$ is the only
invariant point at which there is no blocking. This strongly suggests that Conjecture 9 is correct, even though,
as discussed above, it is insufficient for its proof. Still, we note that the local stability is a substantially
stronger property than a typical “fixed point” argument which is used to “guess” asymptotic properties like
our Conjecture 9. In our case a “fixed point” argument would go as follows: as $r \to \infty$, assume that steady-state
distributions of server states are asymptotically independent; further assume that a subsequential limit
of the marginal distribution of a server state is such that the server is empty with non-zero probability;
under these assumptions, find the set of (limiting) marginal distributions (for each server type), which would
remain invariant (“fixed”) over time; in our case, this argument leads to finding that the only such possible
set of marginal distributions is such that the system must be “sitting” at the point $x^{*,\Box}$, equal to the one
defined in this paper. Note that, in essence, the above argument is nothing else but the statement that $x^{*,\Box}$
is the unique invariant point (at which there is no blocking) for FSPs, while local stability properties in
Lemmas 15 and 16 are much stronger.


6 Discussion 


Proving Conjecture 9 for the finite-server system under GRAND-F is a very interesting and challenging
subject of future work. As discussed in Section 5.1, fluid-scale analysis alone cannot be sufficient for such a
proof, because there may exist sub-optimal points, which are invariant for the FSPs.

The local stability results for the finite-server system with blocking (Proposition 8, Lemmas 15 and 16) hold
for other variants of the finite-server system as well. Indeed, these results and their proofs only concern
the system behavior in the vicinity of the equilibrium point, where there are always available servers for
any customer type. Suppose now that we have a system in which customers are queued instead of being blocked
when there are no available servers for them (or a system where both blocking and queueing are possible).
Then the local stability results still apply for this system, as long as the assignment rule coincides with
GRAND-F when there are servers available to arrivals. Further, this suggests that Conjecture 9 is also
valid for such other variants of the finite-server system, under appropriate versions of GRAND-F. In fact,
recall that GRAND-F, as defined in this paper, itself can be viewed as an extension of the PULL algorithm [16]
to systems with packing constraints. The PULL algorithm has been defined and proved to be asymptotically
optimal for very general systems with queueing and/or blocking (but without packing constraints).

The results of this paper further highlight the universality of the GRAND algorithm. For example, Best Fit type
algorithms are applicable only to the special case of vector packing constraints, where the underlying notion
of a customer “fitting best into the remaining space” at a server makes sense. When packing constraints
are more general, Best Fit is not applicable, while GRAND is. Furthermore, inherently, Best Fit requires
precise information about the current state of each server - this can be a substantial disadvantage in practical
large-scale systems. GRAND, on the other hand, only needs to know whether a given customer fits into a
given server or not; this allows a very efficient practical implementation (as discussed in detail in Remark 6).
It is possible that versions of Best Fit may perform better than GRAND for systems with vector packing
constraints. Paper [5] provides some evidence of that. (Although, the algorithm studied in [5] is not a
“pure” Best Fit, but a Best Fit with randomization, a mixture, in a sense, of Best Fit and GRAND.)
Studying versions of Best Fit is an interesting subject; it is outside the scope of this paper, which is focused
on general packing constraints. First Fit is another approach to packing; algorithms of this type use a fixed
preordering of servers and place each customer into the first one where it can fit. Such algorithms are easily
implementable and apply to general packing constraints. Note that GRAND can be viewed as a First Fit
with random uniform reordering of servers before each customer placement. If the order of servers has to
be chosen and fixed a priori, as “pure” First Fit requires, the question arises of how to do it when the servers
are heterogeneous, as in our model. Exploring variants of First Fit may be another subject of future research.
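The relation between First Fit and GRAND noted above can be made concrete with a short sketch. The function names and the predicate `fits` are hypothetical, introduced only for this illustration.

```python
import random


def first_fit(order, fits):
    """Return the first server in the given order where the customer fits, else None."""
    for server in order:
        if fits(server):
            return server
    return None


def grand_via_first_fit(servers, fits, rng=random):
    """GRAND placement expressed as First Fit over a fresh uniform random ordering."""
    order = list(servers)
    rng.shuffle(order)  # uniform random reordering before this placement
    return first_fit(order, fits)
```

Because the first fitting server in a uniformly random permutation is uniformly distributed among all fitting servers, `grand_via_first_fit` makes the same random choice as direct uniform sampling over available servers, whereas `first_fit` with a fixed order always favors servers early in that order.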